Abstract



A Monte Carlo study was used to examine the Type I error rates of five multivariate tests for the single-factor
repeated measures model. The performance of Hotelling's T² and four nonparametric tests, including a chi-square
and an F test version of a rank-transform procedure, was investigated for different distributions, sample
sizes, and numbers of repeated measures. The results indicated that both Hotelling's T² and the F test version of
the rank-transform performed well, producing Type I error rates which were close to the nominal value. The chi-square
version of the rank-transform test, on the other hand, performed poorly for virtually all conditions studied.
The performance of the other nonparametric tests depended heavily on sample size. Based on these results,
Hotelling's T² is recommended for the single-factor repeated measures model.
An Empirical Study of the Type I Error Rates of Five Multivariate Tests for the
Single-Factor Repeated Measures Model 

Experimental settings in which i = 1, 2, ..., N subjects (blocks) are measured on P occasions with the same
variable are often referred to as repeated measures designs. We consider the unreplicated design in which subjects 
are treated as a random effect and the repeated factor as a fixed effect. It is well known that both univariate and 
multivariate normal-theory tests of the (main) effect associated with the repeated factor can be performed, and 
that both procedures require that the N vectors of errors be independently and (multivariate)-normally distributed 
(Bock, 1975). 

The univariate approach also requires that the covariance matrix of the repeated measures, Σ, possess
sphericity, which exists in the sample if the statistic

\[
\varepsilon = \frac{[\operatorname{tr}(C \Sigma C')]^2}{(P-1)\,\operatorname{tr}[(C \Sigma C')^2]} \tag{1}
\]

equals one; otherwise, the data show some degree of nonsphericity. (The lower bound of ε, indicating maximum
lack of sphericity, is (P-1)⁻¹.) In equation (1), C is a (P-1) x P matrix of coefficients defining a collection of
orthonormalized contrasts and tr is the trace operator (Box, 1954). If ε = 1, or, in practice, is quite close to 1, the
univariate F test is often recommended because of its greater power relative to the multivariate approach (Huynh
& Feldt, 1970; Rouanet & Lepine, 1970). However, use of the univariate F when sphericity is violated is known
to affect the Type I error rate of F, typically producing inflated error rates (Boik, 1981; Collier, Baker, Mandeville,
& Hayes, 1967; Huynh & Feldt, 1980; Mendoza, Toothaker, & Nicewander, 1974). Complicating matters is the
problem that repeated measures data can be expected to be nonspherical (Greenwald, 1976; O'Brien & Kaiser, 1985;
Romaniuk, Levin, & Hubert, 1977; Wilson, 1975), with ε values often between .75 and .85 (Huynh & Feldt, 1976).
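
As a minimal sketch of equation (1), the following Python function computes ε from a P x P covariance matrix. The function name box_epsilon and the use of a QR decomposition to build the orthonormalized contrasts are illustrative assumptions, not part of the original study; any set of orthonormal contrasts gives the same value.

```python
import numpy as np

def box_epsilon(S):
    """Sphericity index of equation (1) for a P x P covariance matrix S."""
    S = np.asarray(S, dtype=float)
    P = S.shape[0]
    # Build (P-1) orthonormal contrasts: orthonormalize [1, e_1, ..., e_{P-1}]
    # and drop the first column, which is proportional to the unit vector.
    basis = np.hstack([np.ones((P, 1)), np.eye(P)[:, :P - 1]])
    Q, _ = np.linalg.qr(basis)
    C = Q[:, 1:].T                      # (P-1) x P, rows orthonormal, each sums to zero
    M = C @ S @ C.T
    return np.trace(M) ** 2 / ((P - 1) * np.trace(M @ M))

# Equal variances and equal correlations of .5 satisfy sphericity.
compound_symmetric = np.full((4, 4), 0.5) + 0.5 * np.eye(4)
print(box_epsilon(compound_symmetric))
```

A covariance matrix with equal variances and equal covariances yields ε = 1, whereas a matrix whose contrast covariance C Σ C' has rank one yields the lower bound 1/(P-1).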

One alternative in the face of nonspherical data is to use an adjusted univariate F test (cf. Huynh, 1978;
Quintanna & Maxwell, 1994; Rogan, Keselman, & Mendoza, 1979); another is to use a multivariate test which
makes no assumption about the structure of Σ. Indeed, several authors (e.g., Cole & Grizzle, 1966; Lewis, 1993;
Marascuilo & Levin, 1983, p. 381; Maxwell & Delaney, 1990, p. 591; Peng, 1975) have expressed a preference for
the multivariate approach, a preference with which we concur. Still, opting for a multivariate test does not settle
things because several such tests are available, including the normal-theory Hotelling's T² and various
nonparametric tests which do not require normality of the distribution of errors.

Description of the Problem 

Akritas and Arnold (1994) provided a theoretical justification for (nonparametric) rank-transform (RT) tests 
for several designs, including the single-factor repeated measures design. Rank-transform tests rank the raw
scores and perform normal-theory tests on the ranks with no assumption that the data are normally distributed.
The Akritas and Arnold test uses Hotelling's T² computed for ranked data. Interestingly, they made no mention
of the fact that their form of T² is the same as that proposed by Agresti and Pendergast (1986).

Despite the intuitive appeal and ease of use of RT tests, and the fact that the RT procedure has been 
embraced in the documentation of the SAS (SAS Inc., 1985, p. 647) statistical analysis program, use of the test 
proposed by Akritas and Arnold should not go unchallenged for at least three reasons. First, there is some 
evidence that the Hotelling T² is robust under certain conditions and, thus, can be used with some nonnormal
distributions. If competing normal-theory and nonparametric tests show the same statistical behavior for realistic
datasets (e.g., nonnormal data), we would opt for the normal-theory test. Second, the validity of RT tests has been
questioned, with several papers (e.g., Blair, Sawilowsky, & Higgins, 1987; Fligner, 1981; Sawilowsky, Blair, & Higgins,
1989) providing evidence of the shortcomings of these tests in certain settings. Third, there are other
nonparametric tests which can be used in the single-factor repeated measures design and as such represent 
important data-analytic alternatives. 

In short, before the Akritas and Arnold RT test can be recommended over its competitors there must be 
evidence supporting its superior statistical properties for realistic data conditions. This paper reports the results 
of a Monte Carlo study of the Type I error rates (α) of five multivariate tests for the single-factor repeated
measures model: Hotelling's T², two versions of Akritas and Arnold's RT test, a test due to Puri and Sen (1969),
and a multivariate Wilcoxon signed-ranks test (Bickel, 1965; Hettmansperger, 1984, pp. 283-285). Univariate RT
tests for the repeated measures model (e.g., Kepner & Robinson, 1988) are not considered.
Data Model and Statistical Tests



Following Davidson (1980), the linear model assumed to underlie the (continuous) data is

\[
Y_{ip} = \mu + \tau_p + \varepsilon_{ip}, \tag{2}
\]

where Σ_p τ_p = 0; E(ε_ip) = 0; and cov(ε_ip, ε_i'p') = δ_ii' σ_pp', where δ_ii' = 1 if i = i' and 0 otherwise and σ_pp' is the covariance. In
equation (2), Y_ip is the observed score of the ith subject on the pth repeated measure, μ is a grand mean, τ_p is a
treatment effect defined as μ_p - μ, and ε_ip is an error term. We assume that covariances among the errors are
collected in the matrix Σ. All of the tests assume that the N vectors of errors are independently distributed.

The hypothesis tested by Hotelling's T² is H0: τ1 = τ2 = ... = τP. The form of the test statistic is

\[
T^2 = N (C\bar{Y})' (C S C')^{-1} (C\bar{Y}), \tag{3}
\]

where Ȳ is the P x 1 vector of sample means and S is the P x P sample covariance matrix. For convenience this
test is often transformed into an F:

\[
F = \frac{(N-P+1)}{(P-1)(N-1)}\, T^2. \tag{4}
\]

Under H0, the above statistic is distributed as an F with P-1 and N-P+1 degrees of freedom if the N error vectors
are multivariate-normally distributed.

The RT procedure of Akritas and Arnold (1994) tests the hypothesis of homogeneity of the marginal
distribution functions, H0: F1(y) = F2(y) = ... = FP(y); rejection of this hypothesis suggests, but does not guarantee,
differences among location parameters. (All of the nonparametric tests in this paper share this null hypothesis.)
To compute the chi-square version of the Akritas and Arnold test (AACHI), the NP raw scores are ranked from
1 to NP, T² is computed on the ranks (denoted T²_RT), and

\[
\mathrm{AACHI} = (P-1)\,\frac{(N-P+1)}{(P-1)(N-1)}\,T^2_{RT} = \frac{(N-P+1)}{(N-1)}\,T^2_{RT} \sim \chi^2_{P-1}. \tag{5}
\]

Under H0, the resulting test statistic is asymptotically distributed as a chi-square variable with P-1 degrees of
freedom. Agresti and Pendergast (1986) recommended computing AAF = AACHI/(P-1) and comparing this value
against an F critical value based on P-1 and N-P+1 degrees of freedom. An advantage of the AACHI and AAF
tests is that standard statistical analysis programs can be used to obtain T²_RT; one simply submits the ranks to a
program that computes T² for repeated measures models. (For all of the nonparametric tests, ties among the raw
scores are handled by assigning midranks, which should not have an adverse effect on these tests unless the
proportion of ties is large (Lehmann, 1975, p. 18).)
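
The rank-transform computations can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used in the study: the helper names (midranks, hotelling_t2, rank_transform_tests) are hypothetical, successive-difference contrasts are used for C (T² does not depend on which full set of contrasts is chosen), and AACHI is taken to be (P-1) times the F statistic of equation (4) computed on the ranks, so that AAF = AACHI/(P-1).

```python
import numpy as np

def midranks(x):
    """Rank values from 1 to len(x); tied values share their average rank."""
    x = np.asarray(x, dtype=float).ravel()
    order = np.argsort(x, kind="stable")
    ranks = np.empty(x.size)
    ranks[order] = np.arange(1, x.size + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    mean_rank = np.bincount(inverse, weights=ranks) / counts
    return mean_rank[inverse]

def hotelling_t2(Y, C):
    """Equation (3): T^2 = N (C ybar)' (C S C')^{-1} (C ybar)."""
    N = Y.shape[0]
    d = C @ Y.mean(axis=0)
    V = C @ np.cov(Y, rowvar=False) @ C.T
    return N * d @ np.linalg.solve(V, d)

def rank_transform_tests(Y):
    """Return (T^2 on ranks, AACHI, AAF) for an N x P matrix of raw scores."""
    Y = np.asarray(Y, dtype=float)
    N, P = Y.shape
    R = midranks(Y).reshape(N, P)        # rank all NP scores together
    C = np.diff(np.eye(P), axis=0)       # (P-1) x P successive differences
    t2_rt = hotelling_t2(R, C)
    aachi = (N - P + 1) / (N - 1) * t2_rt
    aaf = aachi / (P - 1)
    return t2_rt, aachi, aaf
```

AACHI would be referred to a chi-square distribution with P-1 degrees of freedom and AAF to an F distribution with P-1 and N-P+1 degrees of freedom. Because only ranks enter the computation, any strictly increasing transformation of the raw scores leaves all three values unchanged.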

The general linear model procedure due to Puri and Sen (1969) suggests another nonparametric test in the
multivariate repeated measures model. Here P-1 differences are created and ranked from 1 to N(P-1). The
hypothesis of homogeneity of the Fp(y) can be tested with

\[
\mathrm{PS} = (N-1)\,\theta \sim \chi^2_{P-1}.
\]

θ is the eigenvalue obtained from the matrix product of equation (2.26) in Puri and Sen (1969) involving the
between-measure cross-products matrix H and the total cross-products matrix T, and is obtained as the solution
to the Pillai-Bartlett eigenvalue problem. Under the null hypothesis of homogeneity of marginal distribution
functions, PS is asymptotically distributed as a chi-square variable with P-1 degrees of freedom.

Another alternative is the multivariate Wilcoxon signed-rank (MWSR) test due to Bickel (1965). Although 
numerous variations of this procedure have been suggested (e.g., Policello & Hettmansperger, 1976; Utts & 
Hettmansperger, 1980), we study the traditional form of the MWSR test in which P-1 Wilcoxon signed-rank 
statistics are computed and a test statistic is formed from this vector and the covariances among the signed-rank 
statistics. The test statistic is compared to a chi-square variable with P-1 degrees of freedom. 

Akritas and Arnold (1994) used data from Johnson and Wichern (1988, p. 219) to illustrate the computations
for the AACHI test. We use the same data to illustrate the computations for each of the tests in Appendix A.






Review of the Literature 

Surprisingly few studies of multivariate tests for the single-factor repeated measures model have been
reported. As might be expected, most of these have investigated Hotelling's T².

Jensen (1982) used analytic methods to show that T² maintains its Type I error rate for a variety of nonnormal
distributions (e.g., t, Cauchy) if some general criteria involving the shape of the distribution are satisfied. Jensen's
results help to explain Monte Carlo findings indicating that the Type I error rate of T² is robust to symmetric but
nonnormal distributions (e.g., Chase & Bulgren, 1971; Serlin & Harwell, 1989; Utts & Hettmansperger, 1980) and
to mild skewing (e.g., Everitt, 1979). Increasingly asymmetric distributions, on the other hand, have sometimes
produced inflated error rates, even as sample size increases (Chase & Bulgren, 1971; Everitt, 1979). For example,
Everitt (1979) reported error rates of .14 for α = .05 for an exponential distribution, and .30 for a lognormal.
Everitt also reported that increasing sample sizes (5, 10, 15, 20) had a limited effect on error rates. Chase and
Bulgren's (1971) results for P = 3 showed a similar pattern for T² for sample sizes of 5, 10, and 20. However,
Serlin and Harwell (1989) found that T² was robust for exponential data and a sample size of 30, and concluded
"Unlike many simulation experiments, the Type I error results were quite unambiguous, and, for the conditions
of this study, provide a textbook example of a robust test." (p. 13) This discrepancy among studies of the
robustness of T² for asymmetric distributions persists for both equal and unequal between-measure correlations.

Few Monte Carlo studies of nonparametric tests for the repeated measures model have been reported. 
Agresti and Pendergast (1986) found that the AAF test maintained its Type I error rate for a multivariate-normal 
distribution for sample sizes of 10, 30 and 50 and P = 2 versus 5 repeated measures. Serlin and Harwell (1989)
reported similar findings for the AAF test for N = 30, 100 and P = 3, 4 for normal, double-exponential, and
exponential distributions. Serlin and Harwell also reported that the error rates of the PS test under these
conditions were quite conservative.

Design of the Monte Carlo Study

Ideally, the Type I error behavior of the various tests would be investigated analytically. However, such 
solutions are difficult because they almost always require multivariate-normality, the very assumption that 
empirical data can be expected to violate. In addition, the nonparametric procedures are large sample tests, and 
their behavior for small samples must be investigated empirically. We settled for a Monte Carlo study comparing 
the Type I error rates of the five tests. 

Hoaglin and Andrews (1975), Lewis and Orav (1989), and others have argued that Monte Carlo studies 
should be subject to the same principles of experimental design and data analysis as empirical studies. 
Accordingly, the design of our simulation study was an unreplicated 5 (type of distribution) x 3 (sample size) x 
2 (number of repeated measures) fixed effects, fully-crossed factorial. Type of distribution, sample size, and 
number of repeated measures served as independent variables and the empirical Type I error rates as the 
dependent variable. This design made it possible to examine the empirical error rates for evidence of interactions
among the simulation factors and to estimate the magnitude of significant effects. 

The simulation factors and factor levels were selected because of their known (or suspected) effects on the 
Type I error rates of one or more of the tests, and because these factors have been used in previous Monte Carlo 
studies of the repeated measures model. Table 1 outlines the factors and their levels which were manipulated. 
The focus on the effects of increasing asymmetry arose from the effect of this factor in previous Monte Carlo 
studies of the T² test. The γ1 (skewness) = γ2 (kurtosis) = 0 case produced normally distributed data which acted
as a baseline against which other results could be compared, whereas increments of .5 for γ1 permitted the
detection of trends in the empirical error rates for increasingly skewed data (γ2 was not a focus of the simulation
study because there is little evidence that it affects tests of location). A γ1 = .5, γ2 = 1.5 pairing produced a mildly
skewed and somewhat leptokurtic distribution, γ1 = 1, γ2 = 3 a moderately skewed and leptokurtic distribution
which is equal to a chi-square with ν = 8 degrees of freedom, γ1 = 1.5, γ2 = 4.5 a skewed and leptokurtic
distribution, and γ1 = 2, γ2 = 6 a badly skewed and peaked distribution which is equal to a chi-square with ν =
2, or, equivalently, an exponential distribution. The chosen sample sizes of 9, 15, and 30 were intended to reflect
quite small to moderate sample sizes that have been used in previous Monte Carlo studies of this model (Chase
& Bulgren, 1971; Everitt, 1979; Serlin & Harwell, 1989). The same reasoning led to the selection of the P = 3, 4
numbers of repeated measures.

Data Generation 

A Gateway DX2/50 microcomputer was used to generate data. All programming was done in FORTRAN
IV and was supplemented by subroutines written by the second author. The random number generator was taken
from Numerical Recipes (Press, Flannery, Teukolsky, & Vetterling, 1986), with model (2) serving as the underlying
data generation model. The following steps were followed to generate data: (a) NP scores representing
multivariate-normal data were simulated using the Kaiser and Dickman (1962) procedure and, when appropriate,
were transformed to nonnormal data using the method of Vale and Maurelli (1983). Habib and Harwell (1989)
provide details on using the Vale and Maurelli procedure, which combines the Kaiser and Dickman approach with
Fleishman's (1978) procedure for generating nonnormal data through skewness and kurtosis parameters. In all
cases, the between-measure correlations equaled .5 and τ_p equaled 0. (b) Step (a) was repeated 10,000 times and
for each replication the T², AACHI, AAF, PS and MWSR tests were computed and the test statistics compared to
the appropriate critical value for the .05 and .01 levels of significance.
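
The logic of steps (a) and (b) can be sketched for the normal baseline condition only. The Python code below is an illustration under stated assumptions, not a reconstruction of the FORTRAN programs: it uses NumPy's default generator, a Cholesky factor in place of the factor matrix of the Kaiser and Dickman procedure, omits the Vale and Maurelli transformation to nonnormal data, and is restricted to P = 3 so that the F critical value has a closed form (for 2 numerator degrees of freedom, P(F > f) = (1 + 2f/df2)^(-df2/2)). All function names are hypothetical.

```python
import numpy as np

def t2_as_f(Y):
    """Hotelling's T^2 for repeated measures, expressed as the F of equation (4)."""
    N, P = Y.shape
    C = np.diff(np.eye(P), axis=0)                 # (P-1) x P contrasts
    d = C @ Y.mean(axis=0)
    V = C @ np.cov(Y, rowvar=False) @ C.T
    t2 = N * d @ np.linalg.solve(V, d)
    return (N - P + 1) / ((P - 1) * (N - 1)) * t2

def empirical_type1_rate(N=9, rho=0.5, alpha=0.05, reps=4000, seed=1):
    """Proportion of rejections of a true H0 for multivariate-normal data, P = 3."""
    P = 3
    rng = np.random.default_rng(seed)
    R = np.full((P, P), rho)
    np.fill_diagonal(R, 1.0)
    L = np.linalg.cholesky(R)                      # induces correlations of rho
    df2 = N - P + 1
    f_crit = (df2 / 2) * (alpha ** (-2 / df2) - 1) # exact for 2 numerator df
    rejections = 0
    for _ in range(reps):
        Y = rng.standard_normal((N, P)) @ L.T      # all treatment effects are 0
        rejections += t2_as_f(Y) > f_crit
    return rejections / reps

print(empirical_type1_rate())
```

With normally distributed errors the F reference distribution is exact, so the printed proportion should differ from .05 only by sampling error; the sketch uses fewer replications than the 10,000 used in the study to keep the run short.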

Results 

Adequacy of the Data Generation 

The adequacy of the data generation was judged by examining the average skewness, kurtosis, and 
correlation values computed for the simulated data for each combination of conditions, as well as across all 
conditions. Results for the N = 9, P = 3 case for the various distributions are reported in Table 2, along with 
overall summary statistics. We report the N = 9, P = 3 case because problems in producing data with the desired 
properties are likely to be most acute for smaller sample sizes. The results in Table 2 suggest that the simulated
data possessed (approximately) the desired marginal skewness, kurtosis, and correlation values. A similar pattern 
was observed for the larger sample size conditions. 

Analysis of the Empirical Type I Error Rates 

The empirical Type I error rates are reported in Table 3. Because of the similarity of the results for the .01 
and .05 levels, only the latter are reported. The expression .05 ± 1.96[(.05(1-.05))/10,000]^(1/2) was used to establish
a sampling error range for the empirical proportions of rejections. Error rates exceeding the upper limit of .054 
were considered to be inflated and are indicated in Table 3 by a *, and error rates below the lower limit of .046 
were considered to be conservative and are indicated by a **. 
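
The limits of .046 and .054 follow directly from the expression above, as this short Python check shows (the function name error_rate_band is ours):

```python
import math

def error_rate_band(alpha=0.05, reps=10_000, z=1.96):
    """Sampling-error limits for an empirical rejection proportion under H0."""
    half_width = z * math.sqrt(alpha * (1 - alpha) / reps)
    return alpha - half_width, alpha + half_width

lower, upper = error_rate_band()
print(round(lower, 3), round(upper, 3))  # prints 0.046 0.054
```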

The results in Table 3 suggest the following conclusions: (a) Hotelling's T² and the AAF test did the best
job of controlling Type I error rates near the nominal value, (b) The AACHI test performed particularly poorly,
(c) The PS test was extremely conservative and the MWSR test somewhat less so for larger samples.

It is possible that simple descriptive analyses of the empirical error rates may conceal important information 
such as the presence of interactions among simulation factors with respect to the empirical Type I errors. 
Accordingly, the error rates were analyzed for each test using an unreplicated, three-factor, completely between-subjects
ANOVA. The three-way interaction variation was used as an estimate of error. There is some evidence
that using the highest-order interaction term in this fashion in the analysis of Monte Carlo results has little effect
on the results (Alaysin, 1991). Those results which were significant at the .05 level and whose estimate of effect
size exceeded .10 are reported in Table 4. Effect sizes were estimated using Fisher's correlation ratio η² (sum of
squares for that effect divided by the sum of squares total) and the ω² statistic (Hays, 1973, p. 485). Because the
η² and ω² indices did not differ by more than .02 on any effect, only η² is reported in Table 4. The two effects
whose η² was < .10 (.03 and .07) were deemed too small to pursue further.
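
For a balanced, unreplicated factorial such as the 5 x 3 x 2 design used here, the correlation ratio for each main effect can be computed from the array of cell error rates as sketched below in Python. The function name and the example array are illustrative assumptions; the example fills every cell with one of three hypothetical rates that vary only with sample size, and it is not the Table 3 data.

```python
import numpy as np

def main_effect_eta_squared(rates):
    """Eta squared (SS effect / SS total) for each factor of a balanced array.

    `rates` holds one empirical error rate per cell, e.g. shape (5, 3, 2) for
    distribution x sample size x number of repeated measures.
    """
    rates = np.asarray(rates, dtype=float)
    grand = rates.mean()
    ss_total = ((rates - grand) ** 2).sum()
    eta2 = []
    for axis in range(rates.ndim):
        other_axes = tuple(a for a in range(rates.ndim) if a != axis)
        level_means = rates.mean(axis=other_axes)
        cells_per_level = rates.size / rates.shape[axis]
        ss_effect = cells_per_level * ((level_means - grand) ** 2).sum()
        eta2.append(ss_effect / ss_total)
    return eta2

# Hypothetical rates that depend on sample size only.
by_n = np.array([0.127, 0.092, 0.069])
example = np.broadcast_to(by_n[None, :, None], (5, 3, 2))
print(main_effect_eta_squared(example))
```

In this artificial example all of the variation lies between sample sizes, so η² is 1 for that factor and 0 for the other two.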

Interestingly, all of the significant effects reported in Table 4 are main effects, and all produced at least
moderate and occasionally quite large η² values. Only Hotelling's T² was sensitive to type of distribution, a result
which is consistent with the Monte Carlo results of Chase and Bulgren (1971) and Everitt (1979); however, the
marginal mean error rates for T² of Ȳ_{γ1=0, γ2=0} = .049, Ȳ_{γ1=.5, γ2=1.5} = .048, Ȳ_{γ1=1, γ2=3} = .046,
Ȳ_{γ1=1.5, γ2=4.5} = .047, and Ȳ_{γ1=2, γ2=6} = .041 suggest that η² depended heavily on error rates associated
with an exponential distribution. In fact, the
distribution effect is not significant if error rates for the exponential distribution are deleted. The Type I error
rates of the AACHI, PS, and MWSR tests proved to be sensitive to sample size, producing marginal means of
Ȳ_{N=9} = .127, Ȳ_{N=15} = .092, and Ȳ_{N=30} = .069 for the AACHI test, Ȳ_{N=9} = .001, Ȳ_{N=15} = .006, and Ȳ_{N=30} = .012 for the PS
test, and Ȳ_{N=9} = .019, Ȳ_{N=15} = .035, and Ȳ_{N=30} = .044 for the MWSR test. Similarly, the error rates of the AAF and
MWSR tests proved to be sensitive to P, producing marginal means of Ȳ_{P=3} = .051 and Ȳ_{P=4} = .047 for the AAF test
and Ȳ_{P=3} = .040 and Ȳ_{P=4} = .026 for the MWSR test.

Conclusions 

The results of this study suggest that, for the conditions studied, researchers concerned with controlling Type
I error rates can use either Hotelling's T² or the F test version of the Akritas and Arnold (1994) rank-transform
statistic in testing for a main effect in the single-factor repeated measures model. Although the F test version of
the rank-transform statistic performed well, our preference is for Hotelling's T² test because of its use of raw scores
as opposed to ranks and because of its membership in the general linear model family of statistical procedures.
The performances of the multivariate Wilcoxon signed-ranks test and the Puri and Sen test were far less
impressive. Both of these tests produced quite conservative error rates for smaller sample sizes (especially the
Puri and Sen test) which, other things being equal, would be expected to be associated with depressed power
values. The chi-square statistic presented in Akritas and Arnold (1994) performed poorly for all conditions and
is not recommended.
Table 1

Outline of the Simulation Study

                              Independent Variables

Type of Distribution                                       N      P

Normal (γ1 = 0, γ2 = 0)                                    9     3, 4
                                                          15     3, 4
                                                          30     3, 4

Slightly skewed and leptokurtic (γ1 = .5, γ2 = 1.5)        9     3, 4
                                                          15     3, 4
                                                          30     3, 4

Moderately skewed and leptokurtic (γ1 = 1, γ2 = 3)         9     3, 4
                                                          15     3, 4
                                                          30     3, 4

Skewed and leptokurtic (γ1 = 1.5, γ2 = 4.5)                9     3, 4
                                                          15     3, 4
                                                          30     3, 4

Strongly skewed and leptokurtic (γ1 = 2, γ2 = 6)           9     3, 4
                                                          15     3, 4
                                                          30     3, 4

Note. γ1 = skewness, γ2 = kurtosis, N = sample size, P = number
of repeated measures.



Appendix A 
Computing the Tests 

Johnson and Wichern (1988, p. 219) used a dataset involving measurements of time (in milliseconds) between
heartbeats, which was measured 4 times for 19 dogs. The raw data were:

             Repeated Measures

Dog       1       2       3       4

  1     426     609     556     600
  2     253     236     392     395
  3     359     433     349     357
  4     432     431     522     600
  5     405     426     513     513
  6     324     438     507     539
  7     310     312     410     456
  8     326     326     350     504
  9     375     447     547     548
 10     286     286     403     422
 11     349     382     473     497
 12     429     410     488     547
 13     348     377     447     514
 14     412     473     472     446
 15     347     326     455     468
 16     434     458     637     524
 17     364     367     432     469
 18     420     395     508     531
 19     397     556     645     625


Hotelling's T² Test

\[
\bar{Y} = \begin{pmatrix} 368.21 \\ 404.63 \\ 479.26 \\ 502.89 \end{pmatrix}, \qquad
S = \begin{pmatrix}
2819.29 & & & \\
3568.42 & 7963.14 & & \\
2943.49 & 5303.98 & 6851.32 & \\
2295.35 & 4065.44 & 4499.63 & 4878.99
\end{pmatrix}
\]

Ȳ is a P x 1 vector of sample means and S a P x P covariance matrix (only the lower triangle is shown). The
hypothesis to be tested is H0: τ1 = τ2 = τ3 = τ4 = 0, with N = 19 and P = 4. Johnson and Wichern transformed the
P repeated measures into P-1 new variables that contained all the between-measure information in the original
variables. Any number of transformations will do; we follow Johnson and Wichern and use:

\[
C = \begin{pmatrix}
-1 & -1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix}
\]

The sample means are transformed directly:

\[
C\bar{Y} = \begin{pmatrix} 209.31 \\ -60.05 \\ -12.79 \end{pmatrix}
\]

For the covariance matrix S,

\[
CSC' = \begin{pmatrix}
9432.32 & & \\
1098.92 & 5195.84 & \\
927.62 & 914.54 & 7557.44
\end{pmatrix}
\]

Then

\[
T^2 = N (C\bar{Y})' (CSC')^{-1} (C\bar{Y}) = 19(6.11) = 116.
\]

As an F,

\[
F = \frac{(N-P+1)}{(P-1)(N-1)}\, T^2 = (.2963)(116) = 34.37.
\]
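
The computation can be checked numerically. The Python sketch below enters the raw data and the contrast matrix C given above and evaluates equations (3) and (4) with NumPy; the variable names are ours.

```python
import numpy as np

# Milliseconds between heartbeats: 19 dogs (rows) by 4 repeated measures (columns).
Y = np.array([
    [426, 609, 556, 600], [253, 236, 392, 395], [359, 433, 349, 357],
    [432, 431, 522, 600], [405, 426, 513, 513], [324, 438, 507, 539],
    [310, 312, 410, 456], [326, 326, 350, 504], [375, 447, 547, 548],
    [286, 286, 403, 422], [349, 382, 473, 497], [429, 410, 488, 547],
    [348, 377, 447, 514], [412, 473, 472, 446], [347, 326, 455, 468],
    [434, 458, 637, 524], [364, 367, 432, 469], [420, 395, 508, 531],
    [397, 556, 645, 625],
], dtype=float)

C = np.array([[-1, -1,  1,  1],
              [ 1, -1,  1, -1],
              [ 1, -1, -1,  1]], dtype=float)

N, P = Y.shape
ybar = Y.mean(axis=0)                 # vector of sample means
S = np.cov(Y, rowvar=False)           # P x P covariance matrix (divisor N-1)
d = C @ ybar                          # transformed means
T2 = N * d @ np.linalg.solve(C @ S @ C.T, d)
F = (N - P + 1) / ((P - 1) * (N - 1)) * T2

print(np.round(ybar, 2), np.round(d, 2))
print(round(T2, 2), round(F, 2))
```

Small differences from the hand computation are to be expected because the values reported above were obtained from rounded intermediate results.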



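Akritas and Arnold Rank-Transform Test (AACHI)

Ranked data

             Repeated Measures

Dog       1       2       3       4

  1    34.5      73    69.5    71.5
  2       2       1      23    24.5
  3      17      40    13.5      16
  4    38.5      37      62    71.5
  5      28    34.5    59.5    59.5
  6       7      42      57      65
  7       5       6    29.5      47
  8       9       9      15      56
  9      20    44.5    66.5      68
 10     3.5     3.5      27      33
 11    13.5      22    52.5      55
 12      36    29.5      54    66.5
 13      12      21    44.5      61
 14      31    52.5      51      43
 15      11       9      46      49
 16      41      48      75      63
 17      18      19    38.5      50
 18      32    24.5      58      64
 19      26    69.5      76      74

The vector of rank means and the covariance matrix of the ranks (lower triangle) are

\[
\bar{R} = \begin{pmatrix} 20.26 \\ 30.82 \\ 48.32 \\ 54.61 \end{pmatrix}, \qquad
S_{ranks} = \begin{pmatrix}
162.253 & & & \\
182.022 & 447.323 & & \\
172.476 & 292.628 & 371.333 & \\
113.875 & 172.472 & 253.796 & 261.792
\end{pmatrix}
\]

First compute

\[
T^2_{RT} = N (C\bar{R})' (C S_{ranks} C')^{-1} (C\bar{R})
\]

= 119.323,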
\[
\mathrm{AACHI} = \frac{(N-P+1)}{(N-1)}\, T^2_{RT} = (16/18)(119.323) = 106.06.
\]

As an F,

\[
\mathrm{AAF} = \mathrm{AACHI}/(P-1) = 106.06/3 = 35.35.
\]

Puri and Sen Test (PS)

Create P-1 difference variables via the transformation CY', where Y is an N x P matrix of the raw scores. The
resulting difference variable scores are:

                    Difference Variables

Subject          d1               d2               d3

   1         121 (40)        -227 (1)        -139 (5.5)
   2         298 (56)          14 (28)         20 (30)
   3         -86 (10)         -82 (11.5)      -66 (16)
   4         259 (52)         -77 (13)         79 (38)
   5         195 (43)         -21 (22.5)      -21 (22.5)
   6         284 (55)        -146 (4)         -82 (11.5)
   7         244 (49)         -48 (18)         44 (35)
   8         202 (45)        -154 (3)         154 (41)
   9         273 (54)         -73 (14)        -71 (15)
  10         253 (51)         -19 (24)         19 (29)
  11         239 (48)         -57 (17)         -9 (25)
  12         196 (44)         -40 (19.5)       78 (37)
  13         236 (47)         -96 (8)          38 (34)
  14          33 (31)         -35 (21)        -87 (9)
  15         250 (50)           8 (27)         34 (32.5)
  16         269 (53)          89 (39)       -137 (7)
  17         170 (42)         -40 (19.5)       34 (32.5)
  18         224 (46)           2 (26)         48 (36)
  19         317 (57)        -139 (5.5)      -179 (2)

The values in parentheses are ranks. The vector of rank means and the covariance matrix of the ranks (lower
triangle) are

\[
\bar{R}_d = \begin{pmatrix} 45.95 \\ 16.92 \\ 24.13 \end{pmatrix}, \qquad
\Sigma_d = \begin{pmatrix}
116.208 & & \\
14.28 & 99.202 & \\
14.149 & 17.472 & 159.264
\end{pmatrix}
\]

Solving the Pillai-Bartlett eigenvalue problem produces θ = .96, so

\[
\mathrm{PS} = (N-1)\,\theta = (18)(.96) = 17.23
\]
Multivariate Signed-Ranks Wilcoxon Test (MWSR)

The test statistic is MWSR = T'V⁻¹T, where T is a (P-1) x 1 vector of Wilcoxon signed-rank statistics divided by (N+1)
and V is the covariance matrix among these statistics. First the univariate Wilcoxon signed-rank statistic is
computed for each of the P-1 difference variables, divided by (N+1), and stored in T:

\[
T = \begin{pmatrix} 9.4 \\ 1 \\ 4.15 \end{pmatrix}
\]

To compute V we first compute the main diagonal elements, which are simply N(2N+1)/(6(N+1)) = 6.175.
The covariances are computed by adding the cross-products of the signed ranks and dividing by (N+1)². Here

\[
V = \begin{pmatrix}
6.175 & -.354 & -.323 \\
-.354 & 6.175 & .618 \\
-.323 & .618 & 6.175
\end{pmatrix}
\]

Then MWSR = 18.



